SENSEVAL: The evaluation of word sense disambiguation systems
Abstract
Word sense disambiguation (WSD) is the problem of deciding which sense a word has in any given context. The problem of doing WSD by computer is not new; it goes back to the early days of machine translation. But like other areas of computational linguistics, research into WSD has seen a resurgence because of the availability of large corpora. Statistical methods for WSD, especially techniques in machine learning, have proved to be very effective, as SENSEVAL has shown us. In many ways, WSD is similar to part-of-speech tagging. It involves labelling every word in a text with a tag from a pre-specified set of tag possibilities for each word by using features of the context and other information. As with part-of-speech tagging, no one really cares about WSD as a task on its own, but rather as part of a complete application in, for instance, machine translation or information retrieval. Thus, WSD is often fully integrated into applications and cannot be separated out (for instance, in information retrieval WSD is often not done explicitly but is just a by-product of query-to-document matching). But in order to study and evaluate WSD, researchers have concentrated on standalone, generic systems for WSD. This article is not about methods or uses of WSD, but about evaluation.
Similar resources
An evaluation exercise for Romanian Word Sense Disambiguation
This paper presents the task definition, resources, participating systems, and comparative results for a Romanian Word Sense Disambiguation task, which was organized as part of the SENSEVAL-3 evaluation exercise. Five teams with a total of seven systems were drawn to this task.
SENSEVAL-2: Overview
SENSEVAL-2: The Second International Workshop on Evaluating Word Sense Disambiguation Systems was held on July 5-6, 2001. This paper gives an overview of SENSEVAL-2, discussing the evaluation exercise, the tasks, the scoring system, and the results. It ends with some recommendations for future evaluation exercises.
Regularized Least-Squares classification for Word Sense Disambiguation
The paper describes the RLSC-LIN and RLSC-COMB systems, which participated in the Senseval-3 English lexical sample task. These systems are based on the Regularized Least-Squares Classification (RLSC) learning method. We describe the reasons for choosing this method, how we applied it to word sense disambiguation, what results we obtained on Senseval-1, Senseval-2 and Senseval-3 data and discuss some possi...
Introduction to the Special Issue on SENSEVAL
SENSEVAL was the first open, community-based evaluation exercise for Word Sense Disambiguation programs. It took place in the summer of 1998, with tasks for English, French and Italian. There were participating systems from 23 research groups. This special issue is an account of the exercise. In addition to describing the contents of the volume, this introduction considers how the exercise has ...
Finding optimal parameter settings for high performance word sense disambiguation
This article describes the four systems submitted by the author to the English lexical sample task of the SENSEVAL-3 contest. The best recognition rate obtained by one of these systems was 72.9% (fine-grained score).
The University of Alicante systems at Senseval-3
The DLSI-UA team is currently working on several word sense disambiguation approaches, both supervised and unsupervised. These approaches are based on different ways to use both annotated and unannotated data, and several resources generated from or exploiting WordNet (Miller et al., 1993), WordNet Domains, EuroWordNet (EWN) and additional corpora. This paper presents a view of different system...